1.  Introduction

The problem of classifying an observation into one of two multivariate
normal populations with a common covariance matrix might be called the
classical classification problem. Fisher's linear discriminant function
[Fisher (1936)] serves as a criterion when samples are used to estimate
the parameters of the two distributions. The exact probabilities of
misclassification when using this criterion are difficult to compute
because the distribution of the criterion is virtually intractable. Wald
(1944) made considerable progress towards finding the distribution, but
only managed to express the criterion as a function of three angles whose
distribution he gave. T. W. Anderson (1951) and Rosedith Sitgreaves (1952)
continued with the problem. For further references see T. W. Anderson,
Das Gupta, and Styan (1972), Subject Matter Code 6.2.

If the parameters are known, the Neyman-Pearson Fundamental Lemma can be
applied to the classical classification problem [as done by Wald (1944)]
to obtain a discriminant function that is linear in the components of the
vector to be classified. The distribution of this statistic is normal;
the mean and variance depend only on the Mahalanobis distance between the
two populations. Since the procedure for classification is to classify
into one population or the other depending on whether this statistic is
greater or less than a constant, the probabilities of misclassification
are found directly from the normal distribution. If the constant is 0,
the probabilities are equal and the procedure is minimax.
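The known-parameter case can be sketched numerically. The following is a minimal illustration, assuming the statistic is centered so that it is N(Δ²/2, Δ²) under the first population and N(-Δ²/2, Δ²) under the second, where Δ is the Mahalanobis distance; the function name and the example value Δ = 4 are chosen here for illustration only.

```python
from statistics import NormalDist


def known_parameter_errors(delta, c=0.0):
    """Error probabilities of the linear rule when all parameters are known.

    Assumes the statistic is N(delta**2 / 2, delta**2) under population 1
    and N(-delta**2 / 2, delta**2) under population 2.
    """
    std_normal = NormalDist()
    # Population 1 observation sent to population 2 (statistic <= c).
    err1 = std_normal.cdf((c - 0.5 * delta**2) / delta)
    # Population 2 observation sent to population 1 (statistic > c).
    err2 = 1.0 - std_normal.cdf((c + 0.5 * delta**2) / delta)
    return err1, err2


# With the cut-off 0 the two error probabilities coincide.
e1, e2 = known_parameter_errors(4.0)
print(e1, e2)
```

With c = 0 both probabilities reduce to Φ(-Δ/2), which is the equal-error property behind the minimax statement above.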

*This  paper  was  presented  to  the  NATO  Advanced  Study  Institute  on 
Discriminant  Analysis  and  Applications  on  June  12,  1972,  at  Kifissia,  Greece. 
When the parameters are unknown and there is available a sample from
each population, the mean of each population is estimated by the mean of
the respective sample and the common covariance matrix of the populations
is estimated by using deviations from the respective means in the two
samples. The classification function W, proposed by T. W. Anderson
(1951), is obtained by replacing the parameters in the linear function
resulting from the Neyman-Pearson Fundamental Lemma by the estimates;
the substitution for parameters has been called "plugging in" estimates.
This criterion differs from Fisher's discriminant function by subtraction
of the average of the Fisher discriminant function at the two sample
means. Then the distribution depends only on the population distance, and
this fact makes the distribution problem simpler [T. W. Anderson (1951)
and Sitgreaves (1952)], though it is still rather intractable.


When the sizes of the two samples increase, the distribution of W
approaches a normal distribution whose mean and variance depend on the
Mahalanobis distance; if the limiting mean is subtracted from W and the
difference is divided by the limiting standard deviation, the statistic
has the standard normal distribution as its limiting distribution. Bowker
and Sitgreaves (1961) and Okamoto (1963), with correction (1968), have
given asymptotic expansions of the distributions to the order of the
reciprocal of the square of the sample sizes. The approximate probability
depends on the unknown parameter (the distance).

The "Studentized" W statistic is W less the estimate of its limiting
mean divided by the estimate of its limiting standard deviation. It, too,
has the standard normal distribution as its limiting distribution. If a
statistician wants to set his cut-off point to achieve a specified
probability of misclassification, he can use this Studentized W. An
asymptotic expansion of the distribution of this statistic has been given
by T. W. Anderson (1972).

In this paper we compare these two approximations to the probabilities
of misclassification and their uses. For further discussion of the
classification problem see Anderson (1958), Chapter 6.


2.  The asymptotic expansion of the distribution of the classification
statistic W


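Let the two populations be $N(\mu^{(1)}, \Sigma)$ and $N(\mu^{(2)}, \Sigma)$,
and let the two samples be $x_1^{(1)}, \ldots, x_{N_1}^{(1)}$ and
$x_1^{(2)}, \ldots, x_{N_2}^{(2)}$, respectively. The observation to be
classified is $x$, which has the distribution $N(\mu, \Sigma)$, where
$\mu = \mu^{(1)}$ or $\mu = \mu^{(2)}$. The classification statistic W is

(1)  $$W = (\bar{x}^{(1)} - \bar{x}^{(2)})' S^{-1} \left[ x - \tfrac12 (\bar{x}^{(1)} + \bar{x}^{(2)}) \right],$$

where

(2)  $$\bar{x}^{(1)} = \frac{1}{N_1} \sum_{j=1}^{N_1} x_j^{(1)}, \qquad \bar{x}^{(2)} = \frac{1}{N_2} \sum_{j=1}^{N_2} x_j^{(2)},$$

(3)  $$nS = \sum_{j=1}^{N_1} (x_j^{(1)} - \bar{x}^{(1)})(x_j^{(1)} - \bar{x}^{(1)})' + \sum_{j=1}^{N_2} (x_j^{(2)} - \bar{x}^{(2)})(x_j^{(2)} - \bar{x}^{(2)})',$$

and $n = N_1 + N_2 - 2$. The rule is to classify $x$ as coming from
$N(\mu^{(1)}, \Sigma)$ if $W > c$ and from $N(\mu^{(2)}, \Sigma)$ if
$W \le c$, where $c$ may be a constant, particularly 0, or a function of
$\bar{x}^{(1)}$, $\bar{x}^{(2)}$, and $S$.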


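The squared Mahalanobis distance is

(4)  $$\alpha = (\mu^{(1)} - \mu^{(2)})' \Sigma^{-1} (\mu^{(1)} - \mu^{(2)}),$$

which can be estimated by

(5)  $$a = (\bar{x}^{(1)} - \bar{x}^{(2)})' S^{-1} (\bar{x}^{(1)} - \bar{x}^{(2)}).$$

The limiting distribution of W as $N_1 \to \infty$ and $N_2 \to \infty$ is
normal with variance $\alpha$ and mean $\tfrac12 \alpha$ if $x$ is from
$N(\mu^{(1)}, \Sigma)$ and mean $-\tfrac12 \alpha$ if $x$ is from
$N(\mu^{(2)}, \Sigma)$; that is, the standard normal distribution
$N(0, 1)$ is the limiting distribution of $(W - \tfrac12 \alpha)/\sqrt{\alpha}$
for $x$ coming from $N(\mu^{(1)}, \Sigma)$ and of
$(W + \tfrac12 \alpha)/\sqrt{\alpha}$ for $x$ coming from $N(\mu^{(2)}, \Sigma)$.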
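The definitions of W and of the estimated squared distance can be illustrated with a small computation. The sketch below uses plain Python lists and a hand-written Gaussian elimination so that it has no dependencies; the function names and the toy samples are hypothetical and serve only to show how the sample means, the pooled matrix S with divisor n = N1 + N2 - 2, W, and the estimate a fit together.

```python
def mean_vector(sample):
    """Componentwise mean of a list of observation vectors."""
    count = len(sample)
    return [sum(row[i] for row in sample) / count for i in range(len(sample[0]))]


def pooled_covariance(sample1, sample2):
    """Pooled matrix S: summed cross-products about each sample mean, over n."""
    p = len(sample1[0])
    n = len(sample1) + len(sample2) - 2
    s = [[0.0] * p for _ in range(p)]
    for sample in (sample1, sample2):
        m = mean_vector(sample)
        for row in sample:
            d = [row[i] - m[i] for i in range(p)]
            for i in range(p):
                for j in range(p):
                    s[i][j] += d[i] * d[j] / n
    return s


def solve(matrix, rhs):
    """Solve matrix * z = rhs by Gaussian elimination with partial pivoting."""
    p = len(rhs)
    aug = [list(matrix[i]) + [rhs[i]] for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, p):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, p + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * p
    for i in reversed(range(p)):
        tail = sum(aug[i][j] * z[j] for j in range(i + 1, p))
        z[i] = (aug[i][p] - tail) / aug[i][i]
    return z


def classification_statistic(x, sample1, sample2):
    """Return (W, a): the classification statistic and the estimated distance."""
    m1, m2 = mean_vector(sample1), mean_vector(sample2)
    diff = [u - v for u, v in zip(m1, m2)]
    coef = solve(pooled_covariance(sample1, sample2), diff)  # S^{-1}(xbar1 - xbar2)
    mid = [0.5 * (u + v) for u, v in zip(m1, m2)]
    w = sum(c * (xi - mi) for c, xi, mi in zip(coef, x, mid))
    a = sum(c * d for c, d in zip(coef, diff))
    return w, a


# Toy data (hypothetical): two samples of size 4 in p = 2 dimensions.
sample1 = [[2.0, 0.0], [4.0, 0.0], [3.0, 1.0], [3.0, -1.0]]
sample2 = [[-1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
w, a = classification_statistic([3.0, 0.0], sample1, sample2)
print(w, a)
```

In this toy example the observation sits at the first sample mean, so W equals half the estimated squared distance, and an observation at the midpoint of the two sample means would give W = 0.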

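Okamoto's expansion of the probability distribution [(1963),
Corollary 1] to terms of order $n^{-1}$ is

(6)  $$\Pr\left\{ \frac{W - \tfrac12 \Delta^2}{\Delta} \le u \,\Big|\, \mu = \mu^{(1)} \right\}
= \Phi(u) - \frac{1}{n} \phi(u) \Big\{ \frac{1}{\Delta}\Big[ -1 - \tfrac12 pk + \frac{p-2}{2k} \Big] + \frac{p-1}{2}\Delta$$
$$\qquad + \Big[ \Big(1 + \tfrac12 k + \frac{1}{2k}\Big)\frac{p-3}{\Delta^2} + \frac{3p-2}{2} + \frac{1}{2k} + \frac{\Delta^2}{4} \Big] u
+ \Big[ \Big(1 + \frac{1}{k}\Big)\frac{1}{\Delta} + \Delta \Big] u^2$$
$$\qquad + \Big[ \Big(1 + \tfrac12 k + \frac{1}{2k}\Big)\frac{1}{\Delta^2} + 1 \Big] u^3 \Big\} + O(n^{-2}),$$

where $k = \lim N_2/N_1$ as $N_1 \to \infty$ and $N_2 \to \infty$,
$\Delta^2 = \alpha$, and $\Phi(\,)$ and $\phi(\,)$ are the cumulative
distribution function and density of $N(0, 1)$, respectively. If
$\lim N_1/N_2 = 1$, then

(7)  $$\Pr\left\{ \frac{W - \tfrac12 \Delta^2}{\Delta} \le u \,\Big|\, \mu = \mu^{(1)},\ \lim_{n\to\infty} \frac{N_1}{N_2} = 1 \right\}$$
$$= \Phi(u) - \frac{1}{n} \phi(u) \Big\{ -\frac{2}{\Delta} + \frac{p-1}{2}\Delta + \Big[ \frac{2(p-3)}{\Delta^2} + \frac{3p-1}{2} + \frac{\Delta^2}{4} \Big] u + \Big[ \frac{2}{\Delta} + \Delta \Big] u^2 + \Big[ \frac{2}{\Delta^2} + 1 \Big] u^3 \Big\} + O(n^{-2})$$
$$= \Phi(u) + \frac{1}{n} \phi(u) \Big\{ \frac{2}{\Delta^2}\big[ (\Delta + u)(1 - u^2) - (p-2)u \big] - \frac{p-1}{2}\Delta - \Big[ \frac{\Delta^2}{4} + \frac{3p-1}{2} \Big] u - \Delta u^2 - u^3 \Big\} + O(n^{-2}).$$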

The relation between the cut-off point $c$ and the argument $u$ is

(8)  $$c = u\Delta + \tfrac12 \Delta^2, \qquad u = \frac{c - \tfrac12 \Delta^2}{\Delta}.$$

The probability of misclassification when $x$ is from
$N(\mu^{(1)}, \Sigma)$ is (6) [or (7)] with $u$ given by (8); the
probability depends importantly on the parameter $\Delta$.

A cut-off point of particular interest is $c = 0$, which corresponds
to $u = -\tfrac12 \Delta$. If $N_1 = N_2$, this defines a minimax
procedure. In this case the probability of misclassification is

(9)  $$\Pr\left\{ W \le 0 \,\Big|\, \mu = \mu^{(1)},\ \lim_{n\to\infty} \frac{N_1}{N_2} = 1 \right\}
= \Phi\Big(-\frac{\Delta}{2}\Big) + \frac{1}{n} \phi\Big(\frac{\Delta}{2}\Big) \Big[ \frac{p-1}{\Delta} + \frac{p}{4}\Delta \Big] + O(n^{-2}).$$

As far as this approximation goes, the correction term is positive;
that is, the probability of a misclassification error is greater than
the value of the normal approximation. For a given value of $\Delta$ the
correction term and hence the probability (to order $n^{-1}$) increases
with $p$. For a given value of $p$ the probability (to order $n^{-1}$)
decreases with $\Delta$.
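The relation between the equal-sample-size expansion and the special case c = 0 can be checked numerically. The sketch below is an illustration under stated assumptions: the cubic coefficients written out in `w_cdf_equal_sizes` are the N1 = N2 coefficients as read from Okamoto's expansion and should be checked against Okamoto (1963) before being relied on; the function names are introduced here for illustration.

```python
from statistics import NormalDist

STD = NormalDist()


def w_cdf_equal_sizes(u, p, delta, n):
    """Approximate Pr{(W - delta^2/2)/delta <= u | population 1} for N1 = N2.

    The O(n^-2) remainder is dropped; coefficients are assumed as stated
    in the lead-in.
    """
    c0 = -2.0 / delta + 0.5 * (p - 1) * delta
    c1 = 2.0 * (p - 3) / delta**2 + 0.5 * (3 * p - 1) + 0.25 * delta**2
    c2 = 2.0 / delta + delta
    c3 = 2.0 / delta**2 + 1.0
    poly = c0 + c1 * u + c2 * u**2 + c3 * u**3
    return STD.cdf(u) - STD.pdf(u) * poly / n


def w_error_at_zero_cutoff(p, delta, n):
    """Approximate Pr{W <= 0 | population 1} for N1 = N2, to order 1/n."""
    correction = (p - 1) / delta + 0.25 * p * delta
    return STD.cdf(-0.5 * delta) + STD.pdf(0.5 * delta) * correction / n


# At u = -delta/2 the general expression collapses to the c = 0 formula.
p, delta, n = 2, 2.0, 48
print(w_cdf_equal_sizes(-0.5 * delta, p, delta, n), w_error_at_zero_cutoff(p, delta, n))
```

The two printed values agree up to rounding, and the correction term is positive, so the approximate error probability exceeds Φ(-Δ/2).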

Okamoto (as well as Bowker and Sitgreaves) expanded the characteristic
function. The method of Anderson (1972) could be used to obtain the
result.

3.  The asymptotic expansion of the distribution of the Studentized W


To use the approximate probability given by (6) one must know the
parameter $\alpha = \Delta^2$, but this is generally unknown; then the
statistician cannot achieve, even approximately, a desired probability.
However, he can use the fact that $a$ is a consistent estimate of
$\alpha$ and therefore $(W - \tfrac12 a)/\sqrt{a}$ and
$(W + \tfrac12 a)/\sqrt{a}$ have $N(0, 1)$ as the limiting distribution in
cases $\mu = \mu^{(1)}$ and $\mu = \mu^{(2)}$, respectively.

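We can write

(10)  $$W - \tfrac12 a = (\bar{x}^{(1)} - \bar{x}^{(2)})' S^{-1} (x - \bar{x}^{(1)}).$$

Then

(11)  $$\Pr\left\{ \frac{W - \tfrac12 a}{\sqrt{a}} \le u \right\}
= \Pr\left\{ (\bar{x}^{(1)} - \bar{x}^{(2)})' S^{-1} (x - \mu) \le u \sqrt{(\bar{x}^{(1)} - \bar{x}^{(2)})' S^{-1} (\bar{x}^{(1)} - \bar{x}^{(2)})} + (\bar{x}^{(1)} - \bar{x}^{(2)})' S^{-1} (\bar{x}^{(1)} - \mu) \right\}.$$

Since $x$ has the distribution $N(\mu, \Sigma)$ independently of
$\bar{x}^{(1)}$, $\bar{x}^{(2)}$, and $S$, the conditional distribution of
$(\bar{x}^{(1)} - \bar{x}^{(2)})' S^{-1} (x - \mu)$ is
$N[0, (\bar{x}^{(1)} - \bar{x}^{(2)})' S^{-1} \Sigma S^{-1} (\bar{x}^{(1)} - \bar{x}^{(2)})]$, and

(12)  $$r = \frac{(\bar{x}^{(1)} - \bar{x}^{(2)})' S^{-1} (x - \mu)}{\sqrt{(\bar{x}^{(1)} - \bar{x}^{(2)})' S^{-1} \Sigma S^{-1} (\bar{x}^{(1)} - \bar{x}^{(2)})}}$$

has the distribution $N(0, 1)$. Then (11) is

(13)  $$\Pr\left\{ \frac{W - \tfrac12 a}{\sqrt{a}} \le u \right\}
= \Pr\left\{ r \le \frac{u \sqrt{(\bar{x}^{(1)} - \bar{x}^{(2)})' S^{-1} (\bar{x}^{(1)} - \bar{x}^{(2)})} + (\bar{x}^{(1)} - \bar{x}^{(2)})' S^{-1} (\bar{x}^{(1)} - \mu)}{\sqrt{(\bar{x}^{(1)} - \bar{x}^{(2)})' S^{-1} \Sigma S^{-1} (\bar{x}^{(1)} - \bar{x}^{(2)})}} \right\}$$
$$= E\, \Phi\left[ \frac{u \sqrt{(\bar{x}^{(1)} - \bar{x}^{(2)})' S^{-1} (\bar{x}^{(1)} - \bar{x}^{(2)})} + (\bar{x}^{(1)} - \bar{x}^{(2)})' S^{-1} (\bar{x}^{(1)} - \mu)}{\sqrt{(\bar{x}^{(1)} - \bar{x}^{(2)})' S^{-1} \Sigma S^{-1} (\bar{x}^{(1)} - \bar{x}^{(2)})}} \right],$$

where the expectation is with respect to $\bar{x}^{(1)}$, $\bar{x}^{(2)}$, and $S$.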

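When $\mu = \mu^{(1)}$, $\bar{x}^{(1)} - \bar{x}^{(2)}$, $\bar{x}^{(1)} - \mu$,
and $S$ converge in probability to $\mu^{(1)} - \mu^{(2)}$, 0, and
$\Sigma$, respectively. We can expand the argument of $\Phi(\,)$ in a
Taylor's series in terms of $\sqrt{n}$ times the differences between the
estimates and their probability limits. When the expansion includes
third degree terms and the expectations are computed, the result is

(14)  $$\Pr\left\{ \frac{W - \tfrac12 a}{\sqrt{a}} \le u \,\Big|\, \mu = \mu^{(1)} \right\}
= \Phi(u) + \frac{1}{n} \phi(u) \Big[ \frac{p-1}{\Delta}(1 + k) - \Big(p - \tfrac14 + \tfrac12 k\Big) u - \tfrac14 u^3 \Big] + O(n^{-2}).$$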


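Interchanging $N_1$ and $N_2$ gives

(15)  $$\Pr\left\{ \frac{W + \tfrac12 a}{\sqrt{a}} \le v \,\Big|\, \mu = \mu^{(2)} \right\}
= \Phi(v) - \frac{1}{n} \phi(v) \Big[ \frac{p-1}{\Delta}\Big(1 + \frac{1}{k}\Big) + \Big(p - \tfrac14 + \frac{1}{2k}\Big) v + \tfrac14 v^3 \Big] + O(n^{-2}).$$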


The proof of these results was given by T. W. Anderson (1972). If
$\lim N_1/N_2 = 1$, so that $k = 1$,

(16)  $$\Pr\left\{ \frac{W - \tfrac12 a}{\sqrt{a}} \le u \,\Big|\, \mu = \mu^{(1)},\ \lim_{n\to\infty} \frac{N_1}{N_2} = 1 \right\}
= \Phi(u) + \frac{1}{n} \phi(u) \Big[ \frac{2(p-1)}{\Delta} - \Big(p + \tfrac14\Big) u - \tfrac14 u^3 \Big] + O(n^{-2}).$$

The correction term in (14) [(15) or (16)] is positive for $u < 0$.
If $p = 1$, the correction term does not depend on $\Delta$; if $p > 1$,
the correction term decreases with $\Delta$. For $u < 0$, the correction
term increases with $p$.
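The qualitative statements about the correction term can be checked directly from (14). The following is a small sketch, not part of the original derivation; the function names are introduced here, and `k` stands for the limiting sample-size ratio appearing in (14), with `k = 1.0` for equal sample sizes.

```python
from statistics import NormalDist

STD = NormalDist()


def studentized_correction(u, p, delta, n, k=1.0):
    """Order-1/n correction term of (14) for the Studentized W."""
    a = (p - 1) * (1 + k) / delta
    b = p - 0.25 + 0.5 * k
    return STD.pdf(u) * (a - b * u - 0.25 * u**3) / n


def studentized_cdf(u, p, delta, n, k=1.0):
    """Approximate distribution function (14), dropping the O(n^-2) remainder."""
    return STD.cdf(u) + studentized_correction(u, p, delta, n, k)


# Nominal 5 percent point with p = 2, Delta = 2, N1 = N2 = 25 (n = 48).
u = STD.inv_cdf(0.05)
print(studentized_cdf(u, p=2, delta=2.0, n=48))
```

For negative u every piece of the bracket is nonnegative, which is why the approximate probability exceeds the nominal Φ(u); for p = 1 the term in 1/Δ vanishes, so Δ drops out.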

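For $u = -\tfrac12 \Delta$ (which is not $c = 0$)

(17)  $$\Pr\left\{ \frac{W - \tfrac12 a}{\sqrt{a}} \le -\frac{\Delta}{2} \,\Big|\, \mu = \mu^{(1)},\ \lim_{n\to\infty} \frac{N_1}{N_2} = 1 \right\}
= \Phi\Big(-\frac{\Delta}{2}\Big) + \frac{1}{n} \phi\Big(\frac{\Delta}{2}\Big) \Big[ \frac{2(p-1)}{\Delta} + \Big(p + \tfrac14\Big)\frac{\Delta}{2} + \frac{\Delta^3}{32} \Big] + O(n^{-2}).$$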


4.  Numerical values of the correction term for the Studentized W when $N_1 = N_2$

We can obtain an idea of the importance of the term of order $1/n$
by studying numerical values of it. We consider the second term in (16),
which is the error to order $n^{-1}$ of using $\Phi(u)$ for the
probability of misclassification. The correction relative to the nominal
probability of misclassification is

(18)  $$\frac{1}{n} \frac{\phi(u)}{\Phi(u)} \Big[ \frac{2(p-1)}{\Delta} - \Big(p + \tfrac14\Big) u - \tfrac14 u^3 \Big].$$


Table 1 gives values of the term in brackets for the five values of $u$
corresponding to values of $\Phi(u)$ of .1, .05, .025, .01, and .005, and
various values of $p$ and $\Delta$. It is 4.0893 for $u = -1.28155$
[$\Phi(u) = .1$], $p = 2$, and $\Delta = 2$. The correction relative to the
nominal probability of misclassification is the value in the table
multiplied by the ratio $\phi(u)/\Phi(u)$ divided by $n = N_1 + N_2 - 2$.
In the example above it is 4.0893 x 1.755 = 7.1767 divided by $n$. If
$N_1 = N_2 = 25$, then $n = 48$ and the correction relative to the nominal
probability of misclassification is about .15. Here the correction would
be rather small. For values of $N_1$ and $N_2$ somewhat larger, one might
be willing to neglect the correction. One would hope that for these
values of $N_1$ and $N_2$ the error when using this correction term would
be rather small.

We might also be interested in the correction at $u = -\tfrac12 \Delta$.
Table 2 gives the information. For example, for $\Delta = 4$,
$\Phi(-\tfrac12 \Delta) = .022750$ (which would be the minimax probability
if the parameters were known) and the correction is the appropriate
number in the fourth column multiplied by .053991 divided by $n$. If
$N_1 = N_2 = 25$ and $p = 2$, then $n = 48$ and the correction relative to
the nominal probability is 7 x 2.373/48 = .3461.
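The arithmetic of this example can be reproduced from the bracket in (16). The sketch below is illustrative only; the function names are introduced here, and the inputs are those of the example (p = 2, Δ = 4, u = -Δ/2, n = 48).

```python
from statistics import NormalDist

STD = NormalDist()


def bracket_equal_sizes(u, p, delta):
    """Bracketed term of (16): 2(p-1)/delta - (p + 1/4) u - u^3 / 4."""
    return 2.0 * (p - 1) / delta - (p + 0.25) * u - 0.25 * u**3


def relative_correction(u, p, delta, n):
    """Correction of order 1/n divided by the nominal probability Phi(u)."""
    return STD.pdf(u) / STD.cdf(u) * bracket_equal_sizes(u, p, delta) / n


delta, p, n = 4.0, 2, 48
u = -0.5 * delta
print(bracket_equal_sizes(u, p, delta))   # bracket entering the example
print(STD.cdf(u), STD.pdf(u))             # nominal probability and density
print(relative_correction(u, p, delta, n))
```

The first printed value is the bracket 7 used in the example, and the second line gives the values Φ(-2) and φ(2) quoted in the text.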


5.  Comparison of the expansions of the distributions of W and the
Studentized W


It is striking that the asymptotic expansion of the distribution of
the Studentized W is much simpler than that of W itself [the comparison
of (6) with (14) and (7) with (16)], except for the particular case of
$u = -\tfrac12 \Delta$ [(9) with (17)], which has special meaning for W
($c = 0$), but not for the Studentized W.

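It is of interest to compare the correction terms of the two
asymptotic expansions. The difference is

(19)  $$\Pr\left\{ \frac{W - \tfrac12 a}{\sqrt{a}} \le u \,\Big|\, \mu = \mu^{(1)} \right\} - \Pr\left\{ \frac{W - \tfrac12 \alpha}{\sqrt{\alpha}} \le u \,\Big|\, \mu = \mu^{(1)} \right\}$$
$$= \frac{1}{n} \phi(u) \Big\{ \frac{p-2}{\Delta}\Big(1 + \tfrac12 k + \frac{1}{2k}\Big) + \frac{p-1}{2}\Delta
+ \Big[ \Big(1 + \tfrac12 k + \frac{1}{2k}\Big)\frac{p-3}{\Delta^2} + \tfrac12 p - \tfrac34 - \tfrac12 k + \frac{1}{2k} + \frac{\Delta^2}{4} \Big] u$$
$$\qquad + \Big[ \Big(1 + \frac{1}{k}\Big)\frac{1}{\Delta} + \Delta \Big] u^2 + \Big[ \frac{2 + k + 1/k}{2\Delta^2} + \tfrac34 \Big] u^3 \Big\} + O(n^{-2}).$$

If $\lim N_1/N_2 = 1$, so that $k = 1$, the expression simplifies to

(20)  $$\Pr\left\{ \frac{W - \tfrac12 a}{\sqrt{a}} \le u \,\Big|\, \mu = \mu^{(1)},\ \lim_{n\to\infty} \frac{N_1}{N_2} = 1 \right\} - \Pr\left\{ \frac{W - \tfrac12 \alpha}{\sqrt{\alpha}} \le u \,\Big|\, \mu = \mu^{(1)},\ \lim_{n\to\infty} \frac{N_1}{N_2} = 1 \right\}$$
$$= \frac{1}{n} \phi(u) \Big\{ \frac{2(p-2)}{\Delta} + \frac{p-1}{2}\Delta + \Big[ \frac{2(p-3)}{\Delta^2} + \tfrac12 p - \tfrac34 + \frac{\Delta^2}{4} \Big] u + \Big[ \frac{2}{\Delta} + \Delta \Big] u^2 + \Big[ \frac{2}{\Delta^2} + \tfrac34 \Big] u^3 \Big\} + O(n^{-2}).$$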


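In particular, for $u = -\tfrac12 \Delta$ the difference is

(21)  $$\Pr\left\{ \frac{W - \tfrac12 a}{\sqrt{a}} \le -\frac{\Delta}{2} \,\Big|\, \mu = \mu^{(1)},\ \lim_{n\to\infty} \frac{N_1}{N_2} = 1 \right\} - \Pr\left\{ W \le 0 \,\Big|\, \mu = \mu^{(1)},\ \lim_{n\to\infty} \frac{N_1}{N_2} = 1 \right\}$$
$$= \frac{1}{n} \phi\Big(\frac{\Delta}{2}\Big) \Big[ \frac{p-1}{\Delta} + \frac{p}{4}\Delta + \frac{\Delta}{8} + \frac{\Delta^3}{32} \Big] + O(n^{-2}).$$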


Put another way, the correction term for
$\Pr\{(W - \tfrac12 a)/\sqrt{a} \le -\tfrac12 \Delta\}$ is twice the
correction term for $\Pr\{W \le 0\}$ plus
$\phi(\tfrac12 \Delta)\{\Delta/8 + \Delta^3/32\}/n$. The latter term, which
does not depend on $p$, is usually small; values of
$\Delta/8 + \Delta^3/32$ are given in Table 3. Comparison with Table 2
shows that for $p > 1$ this term is small except for large $\Delta$. Thus,
roughly speaking, the correction for the Studentized W is about twice
that of W itself.
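The decomposition just described can be verified numerically. The sketch below compares the bracketed coefficients only (each is to be multiplied by φ(Δ/2)/n); it assumes equal sample sizes, and the function names are introduced here for illustration.

```python
def w_bracket_at_zero_cutoff(p, delta):
    """Bracket of the 1/n term for Pr{W <= 0}: (p-1)/delta + p*delta/4."""
    return (p - 1) / delta + 0.25 * p * delta


def studentized_bracket_at_half_delta(p, delta):
    """Bracket of (16) evaluated at u = -delta/2."""
    u = -0.5 * delta
    return 2.0 * (p - 1) / delta - (p + 0.25) * u - 0.25 * u**3


def extra_term(delta):
    """Part of the Studentized bracket that does not depend on p."""
    return delta / 8.0 + delta**3 / 32.0


for p in (1, 2, 5, 10):
    for delta in (1.0, 2.0, 4.0):
        lhs = studentized_bracket_at_half_delta(p, delta)
        rhs = 2.0 * w_bracket_at_zero_cutoff(p, delta) + extra_term(delta)
        print(p, delta, lhs, rhs)
```

In every row the two columns coincide, which is the identity "twice the correction for W plus Δ/8 + Δ³/32" at the level of the brackets.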

Okamoto (1963) has given numerical values of the term of order $1/n$
and the term of order $1/n^2$ in the expansion of
$\Pr(W \le 0 \mid \mu = \mu^{(1)})$ for $N_1 = N_2 = 100$ ($n = 198$) for
various values of $p$ and $\Delta$. His values for $1/n$ are about twice
the values we can compute from Table 2. In his table for small values of
$p$ and $\Delta$ the ratio of the term of order $1/n^2$ to the term of
order $1/n$ is very roughly $1/n$. The maximum of the $1/n^2$ term over
$\Delta$ increases with $p$. At $p = 7$, for example, it is about .0008.
The table suggests that for small or moderate values of $p$ the second
correction term can be safely ignored for moderately large values of
$N_1$ and $N_2$.

6.  Comparison of approximate densities and moments


Corresponding to the approximate distributions of
$(W - \tfrac12 \alpha)/\sqrt{\alpha}$ and $(W - \tfrac12 a)/\sqrt{a}$ (for
$\mu = \mu^{(1)}$) are densities and moments. It is of interest to compare
these.

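The approximate density of $(W - \tfrac12 \Delta^2)/\Delta$ is

(22)  $$\phi(u) \Big\{ 1 - \frac{1}{n} \Big[ \Big(1 + \tfrac12 k + \frac{1}{2k}\Big)\frac{p-3}{\Delta^2} + \frac{3p-2}{2} + \frac{1}{2k} + \frac{\Delta^2}{4}
+ \Big( \frac{3 + \tfrac12 pk - \tfrac12 (p-6)/k}{\Delta} - \frac{p-5}{2}\Delta \Big) u$$
$$\qquad - \Big( \Big(1 + \tfrac12 k + \frac{1}{2k}\Big)\frac{p-6}{\Delta^2} + \frac{3p-8}{2} + \frac{1}{2k} + \frac{\Delta^2}{4} \Big) u^2
- \Big( \frac{1 + 1/k}{\Delta} + \Delta \Big) u^3 - \Big( \frac{2 + k + 1/k}{2\Delta^2} + 1 \Big) u^4 \Big] \Big\},$$

which for $k = 1$ is

(23)  $$\phi(u) \Big\{ 1 - \frac{1}{n} \Big[ \frac{2(p-3)}{\Delta^2} + \frac{3p-1}{2} + \frac{\Delta^2}{4} + \Big( \frac{6}{\Delta} - \frac{p-5}{2}\Delta \Big) u - \Big( \frac{2(p-6)}{\Delta^2} + \frac{3p-7}{2} + \frac{\Delta^2}{4} \Big) u^2$$
$$\qquad - \Big( \frac{2}{\Delta} + \Delta \Big) u^3 - \Big( \frac{2}{\Delta^2} + 1 \Big) u^4 \Big] \Big\}.$$

The approximate density of $(W - \tfrac12 a)/\sqrt{a}$ is

(24)  $$\phi(u) \Big\{ 1 - \frac{1}{n} \Big[ p - \tfrac14 + \tfrac12 k + \frac{(p-1)(1+k)}{\Delta} u - \Big(p - 1 + \tfrac12 k\Big) u^2 - \tfrac14 u^4 \Big] \Big\},$$

which for $k = 1$ is

(25)  $$\phi(u) \Big\{ 1 - \frac{1}{n} \Big[ p + \tfrac14 + \frac{2(p-1)}{\Delta} u - \Big(p - \tfrac12\Big) u^2 - \tfrac14 u^4 \Big] \Big\}.$$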
The approximate mean of $(W - \tfrac12 \Delta^2)/\Delta$ is

(26)  $$\frac{1}{n} \Big[ \frac{\tfrac12 p/k - \tfrac12 pk}{\Delta} + \frac{p+1}{2}\Delta \Big],$$

which for $k = 1$ is

(27)  $$\frac{1}{n} \frac{p+1}{2}\Delta;$$

the approximate second-order moment is

(28)  $$1 + \frac{1}{n} \Big[ \frac{2p\,(1 + \tfrac12 k + \tfrac{1}{2k})}{\Delta^2} + 3p + 4 + \frac{1}{k} + \tfrac12 \Delta^2 \Big],$$

which for $k = 1$ is

(29)  $$1 + \frac{1}{n} \Big[ \frac{4p}{\Delta^2} + 3p + 5 + \tfrac12 \Delta^2 \Big].$$


The approximate mean of $(W - \tfrac12 a)/\sqrt{a}$ is

(30)  $$-\frac{1}{n} \frac{(p-1)(1+k)}{\Delta},$$

which for $k = 1$ is

(31)  $$-\frac{2}{n} \frac{p-1}{\Delta};$$

the approximate second-order moment is

(32)  $$1 + \frac{1}{n} (2p + 1 + k),$$

which for $k = 1$ is

(33)  $$1 + \frac{1}{n} (2p + 2).$$
In each case the "approximate" moment is the moment of the approximate
density. To order $1/n$ the approximate second-order moment is also the
approximate variance. For $(W - \tfrac12 a)/\sqrt{a}$ the approximate mean
is negative for $p > 1$ (while it is 0 for the standard normal
distribution); its numerical value increases with $p$ and decreases with
$\Delta$. The approximate variance is greater than 1 (the value for the
standard normal distribution); it increases with $p$, but does not depend
on $\Delta$.
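That the stated mean and second-order moment of the Studentized statistic are the moments of its approximate density can be checked by numerical integration. The sketch below uses a plain trapezoidal rule on a wide interval; the function names, the integration limits, and the example parameter values are choices made here for illustration.

```python
import math


def studentized_density(u, p, delta, n, k=1.0):
    """Approximate density of the Studentized W (standard normal times 1 - bracket/n)."""
    phi = math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
    b = p - 0.25 + 0.5 * k
    a = (p - 1) * (1 + k) / delta
    bracket = b + a * u - (p - 1 + 0.5 * k) * u**2 - 0.25 * u**4
    return phi * (1.0 - bracket / n)


def density_moment(order, p, delta, n, k=1.0, lo=-12.0, hi=12.0, steps=24000):
    """Trapezoidal approximation to the integral of u**order times the density."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        u = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * u**order * studentized_density(u, p, delta, n, k)
    return total * h


p, delta, n, k = 3, 2.0, 48, 1.0
print(density_moment(0, p, delta, n, k))                                  # total mass
print(density_moment(1, p, delta, n, k), -(p - 1) * (1 + k) / (n * delta))
print(density_moment(2, p, delta, n, k), 1 + (2 * p + 1 + k) / n)
```

The first value is 1 up to integration error, and the numerical mean and second-order moment match the closed forms printed beside them.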

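7.  Achieving a given probability of misclassification

Suppose one wants to achieve a given probability $P$ of
misclassification when $\mu = \mu^{(1)}$, say. How should one choose the
cut-off point $c = u\sqrt{a} + \tfrac12 a$ for W or equivalently $u$ for
$(W - \tfrac12 a)/\sqrt{a}$?

Let $u_0$ be the number such that $\Phi(u_0) = P$. Then the probability
of misclassification when $u_0$ is used as the cut-off is

(34)  $$P + \frac{1}{n} \phi(u_0) \Big[ \frac{(p-1)(1+k)}{\Delta} - \Big(p - \tfrac14 + \tfrac12 k\Big) u_0 - \tfrac14 u_0^3 \Big] + O(n^{-2}).$$

The correction term of order $n^{-1}$ contains the unknown parameter
$\Delta$ (if $p > 1$). However, $\Delta$ can be estimated by $\sqrt{a}$.
These facts suggest taking

(35)  $$u = u_0 - \frac{1}{n} \Big[ \frac{(p-1)(1+k)}{\sqrt{a}} - \Big(p - \tfrac14 + \tfrac12 k\Big) u_0 - \tfrac14 u_0^3 \Big].$$

Then

(36)  $$\Pr\left\{ \frac{W - \tfrac12 a}{\sqrt{a}} \le u \,\Big|\, \mu = \mu^{(1)} \right\}
= \Pr\left\{ \frac{W - \tfrac12 a}{\sqrt{a}} + \frac{1}{n} \frac{(p-1)(1+k)}{\sqrt{a}} \le u^* \right\},$$

where

(37)  $$u^* = u_0 + \frac{1}{n} \Big[ \Big(p - \tfrac14 + \tfrac12 k\Big) u_0 + \tfrac14 u_0^3 \Big].$$

If $p = 1$, this probability is (14) with $u = u^*$, which is
$P + O(n^{-2})$.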
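The adjusted cut-off can be computed directly from (35) and (37). The following is a minimal sketch; the function names, the argument `a_hat` (the estimated squared distance), and the example values are introduced here for illustration, and `k` is the limiting sample-size ratio appearing in (14).

```python
import math
from statistics import NormalDist

STD = NormalDist()


def studentized_cdf(u, p, delta, n, k=1.0):
    """Expansion (14) without the O(n^-2) remainder."""
    bracket = (p - 1) * (1 + k) / delta - (p - 0.25 + 0.5 * k) * u - 0.25 * u**3
    return STD.cdf(u) + STD.pdf(u) * bracket / n


def adjusted_cutoff(target, p, a_hat, n, k=1.0):
    """Return (u0, u): the nominal normal point and the adjusted point of (35)."""
    u0 = STD.inv_cdf(target)
    bracket = ((p - 1) * (1 + k) / math.sqrt(a_hat)
               - (p - 0.25 + 0.5 * k) * u0 - 0.25 * u0**3)
    return u0, u0 - bracket / n


def u_star(u0, p, n, k=1.0):
    """The point u* of (37), which does not involve the estimated distance."""
    return u0 + ((p - 0.25 + 0.5 * k) * u0 + 0.25 * u0**3) / n


# Target 5 percent, p = 2, estimated squared distance 4, N1 = N2 = 25.
u0, u = adjusted_cutoff(0.05, p=2, a_hat=4.0, n=48)
print(u0, u)

# For p = 1 the adjusted point is u*, and (14) at u* is close to the target.
u0_1, u_1 = adjusted_cutoff(0.05, p=1, a_hat=4.0, n=48)
print(u_1, u_star(u0_1, 1, 48), studentized_cdf(u_1, 1, 2.0, 48))
```

For a target below one half the adjusted point lies below the nominal point, which compensates for the positive correction term in (34).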

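When $p > 1$, we calculate the probability of misclassification as

(38)  $$\Pr\left\{ W - \tfrac12 a \le u^* \sqrt{a} - \frac{1}{n}(p-1)(1+k) \right\}$$
$$= \Pr\left\{ (\bar{x}^{(1)} - \bar{x}^{(2)})' S^{-1} (x - \mu) \le u^* \sqrt{(\bar{x}^{(1)} - \bar{x}^{(2)})' S^{-1} (\bar{x}^{(1)} - \bar{x}^{(2)})} + (\bar{x}^{(1)} - \bar{x}^{(2)})' S^{-1} (\bar{x}^{(1)} - \mu) - \frac{1}{n}(p-1)(1+k) \right\}$$
$$= E\, \Phi\left[ \frac{u^* \sqrt{(\bar{x}^{(1)} - \bar{x}^{(2)})' S^{-1} (\bar{x}^{(1)} - \bar{x}^{(2)})} + (\bar{x}^{(1)} - \bar{x}^{(2)})' S^{-1} (\bar{x}^{(1)} - \mu) - \frac{1}{n}(p-1)(1+k)}{\sqrt{(\bar{x}^{(1)} - \bar{x}^{(2)})' S^{-1} \Sigma S^{-1} (\bar{x}^{(1)} - \bar{x}^{(2)})}} \right],$$

where $\bar{x}^{(1)} - \bar{x}^{(2)}$, $\bar{x}^{(1)} - \mu$, and $S$ have the
joint distribution given in Anderson (1972). Then the expansion of
$\Phi(\,)$ is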


(39)  ${u*  +  —  C*(Z,V)  +  —  D*(Y,Z,V)  +  rt  (Y,Z,V) 

&  ~  ~  n  ~  ~  ~  /n  ~ 


-  -  (p-l)(l+k)  (S'v.-SVfi)  +  r* (Y,Z,V) 

LA  A3  ~  1 . - 


n 


=  $(u*)  +  <K u*)  (—  C*(Z,V)  +  - 


D*(Y,Z,V) 


1  *7  1  ll 

-  |  u*  C  Z(Z,V)  -  ±  (p-l)(l+k)  f\ 


+  3  \/2  (p~l)  (1+k)  (6’Y-6’V6)1 
A  n 


+  “572  +12 

n  n 


+  rioJM>V  > 


-1) (l+k) } 

P-1) (l+k) 


r*(Y,Z,V) 
where $C^*(Z,V)$, $D^*(Y,Z,V)$, and $r^*_{1n}(Y,Z,V)$ are $C(Z,V)$,
$D(Y,Z,V)$, and $r_{1n}(Y,Z,V)$ of Anderson (1972), with $u$ replaced by
$u^*$, and $r^*(Y,Z,V)$ is the remainder term in (19) of Anderson (1972).
The expected value of $\Phi(\,)$ is

(40)  $$\Phi(u^*) + \frac{1}{n} \phi(u^*) \Big[ -\Big(p - \tfrac14 + \tfrac12 k\Big) u^* - \tfrac14 u^{*3} \Big] + O(n^{-2}).$$
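One way to see that (40) equals the target probability $\Phi(u_0)$ to the stated order is a first-order Taylor expansion of $\Phi$ about $u_0$. The following is a sketch of that step, with $B = p - \tfrac14 + \tfrac12 k$ introduced here only as shorthand and $u^*$ as defined in (37).

```latex
\begin{align*}
u^* &= u_0 + \tfrac{1}{n}\left[ B u_0 + \tfrac14 u_0^3 \right], \\
\Phi(u^*) &= \Phi(u_0) + \tfrac{1}{n}\,\phi(u_0)\left[ B u_0 + \tfrac14 u_0^3 \right] + O(n^{-2}), \\
\tfrac{1}{n}\,\phi(u^*)\left[ -B u^* - \tfrac14 u^{*3} \right]
  &= -\tfrac{1}{n}\,\phi(u_0)\left[ B u_0 + \tfrac14 u_0^3 \right] + O(n^{-2}), \\
\Phi(u^*) + \tfrac{1}{n}\,\phi(u^*)\left[ -B u^* - \tfrac14 u^{*3} \right]
  &= \Phi(u_0) + O(n^{-2}).
\end{align*}
```

The third line uses $u^* - u_0 = O(n^{-1})$, so replacing $u^*$ by $u_0$ inside a term that already carries the factor $1/n$ changes it only by $O(n^{-2})$.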